issue 1111 merge new stac #412
tests/test_workspace.py
Outdated
@@ -67,7 +67,7 @@ def test_merge_from_disk_new(tmp_path):
 for asset_key, asset in item.get_assets().items()
 }
 assert asset_workspace_uris == {
-"asset.tif": f"file:{workspace.root_directory / 'path' / 'to' / 'collection.json_items' / 'asset.tif'}"
+"asset.tif": f"file:{workspace.root_directory / 'path' / 'to' / 'collection.json_items' / 'asset.tif' / 'asset.tif'}"
Was there a reason for the filename now appearing twice in the path?
The path of the asset is collection path + item ID + asset filename; in this test both the item ID and the asset filename are asset.tif. The merge now takes the path relative to the collection path, i.e. item ID + asset filename, which yields asset.tif twice. See:
openeo-python-driver/tests/test_workspace.py
Line 208 in 302d50f
asset_path = root_path / item.id / asset_filename
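To make the doubling concrete, here is a minimal sketch of the path arithmetic described above. The directory `/tmp/job` is a made-up placeholder for `root_path`, and `item_id` and `asset_filename` stand in for the test's dummy STAC item; this is not the actual test code.

```python
from pathlib import PurePosixPath

# Hypothetical stand-ins for the values used in the test.
root_path = PurePosixPath("/tmp/job")  # assumed source directory
item_id = "asset.tif"                  # the dummy item's ID
asset_filename = "asset.tif"           # the dummy asset's filename

# Same composition as the referenced line in tests/test_workspace.py.
asset_path = root_path / item_id / asset_filename

# Taking the path relative to the root keeps the item ID component,
# so the filename shows up twice.
relative_path = asset_path.relative_to(root_path)
print(relative_path)
```

With these inputs the relative path has two components, the item ID followed by the filename, which is what ends up under `collection.json_items` in the updated assertion.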
I'll look into it further but I'm not sure about this.
The problem with the current workspace implementations is that they assume the asset key is a path relative to the job directory.
With the introduction of unified asset keys, the keys no longer carry any meaning (they could be anything), so this assumption no longer holds; the shortcut in the code has to be removed and implemented in a different way.
It's true that the dummy STAC items in these tests mirror the current implementation in that their item ID == asset key == relative asset path; this is a bit confusing with the new implementation in mind, but the actual item ID should not matter.
I would expect the workspace URI (the URI an asset gets when it is exported to a workspace) to remain the same: /path/to/collection.json_items/asset.tif, with no item ID involved.
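As a rough illustration of that expectation, the sketch below derives the workspace URI from the asset's filename only. `expected_workspace_uri` and its parameters are hypothetical names made up for this comment, not the driver's API, and the `<target>_items` directory naming is inferred from the test assertion.

```python
from pathlib import PurePosixPath


def expected_workspace_uri(workspace_root: str, target: str, asset_href: str) -> str:
    """Hypothetical helper: build the workspace URI from the asset filename alone.

    Neither the item ID nor the asset key takes part in the result.
    """
    # Assumption: exported assets live in a "<target>_items" directory.
    items_dir = PurePosixPath(workspace_root) / f"{target}_items"
    filename = PurePosixPath(asset_href).name
    return f"file:{items_dir / filename}"


# The item ID ("some-item") only appears in the source href and is dropped.
uri = expected_workspace_uri(
    "/ws", "path/to/collection.json", "/jobs/j-1/some-item/asset.tif"
)
print(uri)
```

Under these assumptions the result is the same for any item ID or asset key, since only the last component of the source href is used.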
Mostly fine, but it didn't fully support the asset_per_band format option; that only became apparent in openeo-geopyspark-driver's test_batch_result.test_export_workspace_merge_filepath_per_band, so I fixed it.
I also made it backwards compatible with the old (non-unified asset keys) implementation. That allows us to keep running the tests for this old implementation (important because this feature is still behind a feature flag) without checking the feature flag here as well.
Open-EO/openeo-geopyspark-driver#1111